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Abstract 

Sensitivity analysis of a numerical model, for instance simulating physical phenomena, is useful to 
quantify the influence of the inputs on the model responses. This paper proposes a new sensitivity index, 
based upon the modification of the probability density function (pdf) of the random inputs, when the 
quantity of interest is a failure probability (probability that a model output exceeds a given threshold). 
An input is considered influential if the input pdf modification leads to a broad change in the failure 
probability. These sensitivity indices can be computed using the sole set of simulations that has already 
been used to estimate the failure probability, thus limiting the number of calls to the numerical model. 
In the case of a Monte Carlo sample, asymptotical properties of the indices are derived. Based on 
KuUback-Leibler divergence, several types of input perturbations are introduced. The relevance of this 
new sensitivity analysis method is analysed through three case studies. 



1 Introduction 

In the context of structural reliability, computer codes are used in order to assess the safety of industrial 
systems relying on complex physical phenomena. For instance, an electric operator would like to predict the 
height of a potential river flood in order to determine the height of a dyke preventing any disaster. In this 
example, the computer code (simulating the hydraulic model) has some uncertain input variables (flow rate, 
river length, water height, etc.), that are modelled by random variables. In this paper, the computer code 
is a "black-box" deterministic numerical model and one of its output is considered. Due to the randomness 
of the model inputs, this output is a random variable more or less sensitive to the uncertainty of the input 
variables. 

Sensitivity analysis (SA) is a tool used to explore, understand and (partially) validate computer codes. 
It aims at explaining the outputs regarding the input uncertainties ([13]). The definition of SA differs from 
fields and authors. We use the "global SA" definition given by Saltelli et al. [M] wherein the whole variation 
range of the inputs is considered. The application of such an approach can be model simplification (by 
removing irrelevant modelling elements), input variables ranking or research prioritization. There is a wide 
range of SA techniques, regarding what type of problem the experimenter faces with ([51). For instance, 
screening methods are to be applied when there is a large number of inputs, and few models assumptions. 
For a quantitative point of view, the most popular techniques are variance-based methods, based upon the 
functional Hoeffding variance decomposition [1^ and the so-called Sobol' indices (fl4|). 

It should be noted that most SA methods focus on real-valued continuous numerical output variables. 
When the output is a binary value (e.g. when the numerical model returns "faulty system" or "safe system"), 
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SA techniques are underdeveloped. Some basic techniques can be quoted, such as Monte-Carlo filtering ([H]) 
which consists in measuring differences between a "safe" sample and a "faulty" sample via standard statistical 
tests. In a different scientific field, the reliability index resulting from the First or Second Order Reliability 
Methods (FORM/SORM, [5]) can also be used to classify the impact of the inputs on the failure probability. 
More recent works give methods combining always the two objectives: estimating a failure probability and 
assessing the influence of the input uncertainty on the failure probability ([TTl [T^). 

In this paper, a real-valued numerical model denoted by G : R'* — ?> R is considered. This model may 
further be called the "failure function". In practice, each run of G can be CPU time consuming. We are 
interested in the event G(X) < (system failure) and in the complementary event G(X) > (system safe 
mode). X = {Xi, ...^Xd)^ is a d-dimensional continuous random variable whose joint probability density 
function (pdf) is denoted /. For j = 1, • • • , d, let /i denotes the distribution of Xi (the marginal pdf). We 
make the assumption that all components of X are independent. The quantity of interest is the system failure 
probability: 

= y l{G(x)<o}/(x)dx. 

The aim of this work is the quantification of the infiuence of each variable Xi on this probability, by using 
the same set of calculations that have been used for the estimation of the failure probability. 

In most cited works, sensitivity indices for failure probabilities were defined in strong correspondence with 
a given method of estimation (e.g. [9j[T2]), and their interpretation is consequently limited, as stressed in 
[To]. To answer to genericity concerns expressed by these authors, this article first aims at defining sensivity 
indices that have more intrinsic relevance (Section [2]). Nonetheless, they have to be estimated in practice in 
function of the method. For simplicity reasons, a classical Monte Carlo framework is considered to estimate P 
and the indices. It is also useful to derive the theoretical properties of the estimators of the indices. Pursuing 
the same idea of offering extended tools of sensitivity analysis. Section |3] focuses on generic strategies of input 
perturbation based upon maximum entropy rules. The behaviour of the indices is examined in Section |4] 
through numerical simulations in various complexity settings, involving toy examples and a realistic case- 
study. Comparisons with two reference methods (FORM indices and Sobol' indices) highlight the relevance 
of the new indices in most situations. The main advantages and remaining issues are finally discussed in the 
last section of the article. That introduces avenues for future research. 



2 Definition, estimation and properties of a sensitivity index 



Given a unidimensional input variable Xi with pdf fi and some perturbation parameter 5 lying in a given 
subset of R, let call Xis ~ fis the corresponding perturbed random input. Accordingly, the failure probability 
becomes 

, fz5{Xi) 



(1) 



where Xi is the component of the vector x. Independently of the mechanism chosen for the perturbation 
(see next Section for proposals), a good sensitivity index Sis should have intuitive features that make it 
appealing to reliability engineers and decision-makers. We believe that the following proposal can fulfil these 
requirements: 



P 



1 



1 



P 

pTs 



P ■ l{Pii>P} + PiS ■ 1{P,5<P} 



Firstly, Sis = if Pis = P, as expected if Xi is a non-influential variable or if 6 expresses a negligible pertur- 
bation. Secondly, the sign of Sis indicates how the perturbation impacts the failure probability qualitatively. 
It highlights the situations when Pis > P amounts to determining if the remaining (epistemic) uncertainty 
on the modelling Xi ^ fi can increase the failure risk and therefore should be more accurately analysed. 
Conversely, P can be interpreted as a conservative assessment of the failure probability, robust to pertur- 
bations on Xi, if Pis < P. In such a case, deeper modelling studies on Xi appear less essential. Thirdly, 
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given its sign the absolute value of Sis has simple interpretation and provides a level of the conservatism or 
non-conservatism induced by the perturbation: a value of a > for the index means that Pig = (1 + a)P. If 
S.,s - -a < then P,s = (1/(1 + |a|))P. 

The postulated ability of Sis to enlighten the sensitivity of P to input perturbations must be tested 
in concrete cases, when an estimator P/v of P can be computed using an already available design of N 
numerical experiments. In this paper, N is assumed to be large enough such that statistical estimation 
stands within the framework of asymptotic theory. Besides, we assume for simplicity a standard Monte Carlo 
design of experiments, according to which P/v = J2n=i l{G(x")<o} where the x^, • • • , are independent 
realisations of X. The strong Law of Large Numbers (LLN) and the Central Limit Theorem (CLT) ensure 
that for almost all realisations Pn > P and 

N^oo 



^N/[Pil-P)]{PN~P) AA(0,1). 

A*— >-oo 

An interest of the Monte Carlo framework is that Pis can be consistently estimated without new calls to G, 
through a "reverse" importance sampling mechanism: 



This property holds in the more general case when P is originally estimated by importance sampling rather 
than simple Monte Carlo, which is more appealing in contexts when G is time-consuming [3J [7]. This 
generalization is discussed further in the text (Section [5]). 

Lemma 2.1 Assume the usual conditions 
(i) Supp{fis) C Supp{fi), 

JSuppifi) Ji[x) 

then PiSN > PiS o-nd VN<y~g\j (PiSN ^ Pis] — ^ — > A/'(0, 1). The exact expression of aZi, is given in 

T V— >oo V / AT— foo 

AvvendixV^ eauation \l(A It can he consistently estimated by 



The proof of this Lemma is given in Appendix \A.l\ 

The asymptotic properties of any estimator of Sis will depend on the correlation between and PiSN ■ The 
next proposition summarizes the features of the joint asymptotic distribution of both estimators. 



Proposition 2.1 Under assumptions (i) and (ii) of Lemma \2.1 

Vn 



Pn \ P 

P^SN J V 



^ >AA2(0,E,5) 



Af-s-oo 

where Y,is is given in AvvendixV^ eauation \ll\ and can he consistently estimated hy 

y _ ( Pn{1-Pn) PrSNil-pN) 

\Pun{1~Pn) 
The proof of this Proposition is given in Appendix \A.2i 
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Given {Pn , PiSn), the plugging estimator for Sis is 



5i 



i<5Af 



PiSN 



- 1 



N 



^{P^5N>PN} 



N 



PiSN 



■^{PunKPn}- 



(2) 



In corollary of Proposition l2.11 applying the continuous- mapping theorem to the continuous function s{x, y) — 



1 



SiSN converges a.s. to Sis- 



Proposition 2.2 Assume that assumptions (i) and (ii) of Lemma \2.1\ hold and further that P ^ Pig. We 
have 



SiSN ~ SiS 



d s d s 

with d = (tt P^s), Pis))"^ for x^y, and 

ox ay 

ds 



N- 



■M{0,d^T.d) 



(3) 



— (2:,y) = -yl[y>,}/x - l^y^^^, 

ds 1 2 

— (a;,y) ^ -l{y>x} + xl{y<^}/y . 



The following CLT results from Theorem 3.1 in 116]. Notice that it is also the case when P = Pig. Indeed, 
one has for x* 7^ ; 

lim Vs(a;, y)= lim Vs(x,y) = 

y > X y < X 





3 Methodologies of input perturbation 

Our sensitivity analysis method requires to define a perturbation for each input. In general, and especially 
in preliminary reliability studies, there is no prior rule allowing to elicit a specialized perturbation for each 
input variable. When conducting such an analysis, it is advisable to propose one or several fair methodologies 
for perturbing the inputs. 

More precisely, we suggest to define a perturbed input density fig as the closest distribution to the original 
fi in the entropy sense and under some constraints of perturbation. Information-theoretical arguments ([5]) 
led us to choose the KuUback-Leiblcr (KL) divergence between fis and fi as a measure of the discrepancy to 
minimize under those constraints. Recall that between two pdf p and q we have 

KL{p,q) = p(y)\ogP^dy if log ^ G L'ip{y)dy). (4) 

q{y) liy) 

Let i = 1, • • • , d, the constraints are linear as functional of the modified density /,„od: 

J gk{xi)f^ad{xi)dxi = Sk.i {k = l---K). (5) 

Here, for k = 1, • • • ,K, are given functions and 5k^i are given real. These quantities will lead to a 
perturbation of the original density. The modified density fig considered in our work is: 

fig= argmin KL{f„,,a,fi) (6) 
/modldSJ holds 

and the result takes an explicit form ([6^) given in the following proposition. 
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Proposition 3.1 Let us define, for X = (Ai, • • • , Xk)^ G R^, 



ipi{X) = log / /,(x)exp 



K 



.fe=l 



dx 



(7) 



where the last integral can be finite or infinite (in this last case ipi{X) = +00). Further, set dom ipi — {X 
R^\tjji{X) < +CX)}. Assume that there exists at least one pdf fm satisfying and that dom tpi is an open 
set. Then, there exists a unique X* such that the solution of the minimisation problem 0) is 



fis{xi) = fi{xi)eTip 



K 



.fc=i 



The theoretical technique to compute A is provided in appendix |B] Hereby are presented two kinds of 
perturbations used further on. 



Mean twisting The first moment is often used to parametrize a distribution. Thus the first perturbation 
presented here is a mean shift, that is expressed with a single constraint: 

j xj^^^{xi)dxi = 5^ . (8) 

In term of SA, this perturbation should be used when the user wants to understand the sensitivity of the 
inputs to a mean shift - that is to say "what if the mean of input were Si instead of E [X^]". 

Proposition 3.2 Considering the constraint under the assumptions of Proposition [STl] the expression 
of the optimal perturbed density is 

fiS,{xi) = exp(A*Xi - 'ip,{\*))f,{xi) 
where A* is such that equation (0) holds. 

It can also be noted that equation ([7]) becomes 

= log J fi{xi)exp{Xxi)dxi = log{Mx,{Xj) 

where Mxi{u) is the moment generating function (mgf) of the i—th input. With this notation, A* is such 
that 

J Xi exp{X* Xi - log {Mx,{X*})) ft{xi)dxi = (5^ , 

which leads to 

J Xiexp{X*Xi) fi{xi)dx = 5iMx,{X*) . 

This equation can be simplified to: 

M'xM*) 
Mx.(A*) 

This equation may be easy to solve when one has the expression of the mgf of the input Xi and of its 
derivative. 



Variance twisting In some cases, the mean of an input may not be the main source of uncertainty, but 
rather the second moment. This case may be treated considering a couple of constraints. The perturbation 
presented is a variance shift, therefore the set of constraints is: 

f / Xif„,„i{xi)dxi = E [Xi] , 

\ J xjf,^aM)dx, = Ko„« + E . ^ 

The perturbed distribution has the same expectation E [Xi] as the original one and a perturbed variance 



5 



Proposition 3.3 Under the assumptions of Proposition \3.1[ for the constraint (0), the expression of the 
optimal perturbed density is: 

fiS,{xi) = exp{\lx + \*2x'^ ~ ^pi{\*))f^{xi) 
where XI and A2 are so that equation (0) holds. 

Proposition 3.4 Let consider that the original random variable Xi is distributed according to a Natural 
Exponential Family (NEF). Recalls that a NEF's pdf is of the form: 

fifiixi) = b(xi) exp [xiO - ri{e)] 

where is a parameter from a parametric space Q, b{.) is a function that depends only of Xi and 

r]{d) = log J b{x) exp [xi6] dxt 

is the cumulant distribution function. Considering the assumptions of Provosition COl then it is straightfor- 
ward by theorem 3. 1 in ^6] that optimal pdfs proposed respectively in Proposition \3.2\ and Proposition \3.3\ are 
also distributed according to a NEF. The details of computation are given for a mean shift and a variance 
shift in Appendix D. 



4 Numerical experiments 

In this Seetion, the methodology is tested on two academic cases and a more realistic industrial code. The 
new indices are compared to the results of two reference methods, FORM indices (or Importance Factors, IF) 
and Sobol' indices (SI). Both are computed using the methodologies given in [5] and |T5], respectively. To 
assess the reproducibility of the estimation of the SI, a sample of 10^ points is used, and 50 replications are 
made. Thus all the estimations of the SI are the mean of the obtained values and the coefficient of variation 
(CV) of the index is provided. One should notice that the SI are applied on the indicator of the failure 
function 1{g(x)<o}- Following the definition of IF and SI, those indices lies in [0, 1]. 



4.1 Hyperplane failure surface 

For the first example, X is set to be a 4— dimensional vector, with d = A independent marginal distributions 
normally distributed with parameters and 1. Therefore fxi ~ A/'(0, 1) for i = 1, ..,4. The failure function 
is defined as: 

4 

G(X) ^k-J2 ^rX, 



where k and a = (ai, 02, 03, 04) are the parameters of the model. For this numerical example, parameters are 
set with values fc = 16 and a = (1, —6, 4, 0). An explicit expression for P can be given since G(X) behaves 



like a Gaussian distribution with mean k and standard deviation 



A ^ of . Therefore: 

\ i=i 



P = -fc/ 



\ 



-16 



0.014 



where 0(.) is the standard normal cumulative distribution function. 

It is expected that the influence of Xi on P uniquely depends on | | . The greater the absolute value 
of the coefficient is, the bigger the expected influence is. The aim of choosing one non-influential (dummy) 
variable (because 04 = 0) is to assess if the SA methods can identify this variable as non-influential on 
the failure probability. 
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4.1.1 FORM 



In this ideal hyperplane failure surface case, FORM performs well as expected [9] by providing an approx- 
imated value PpoRM = 0.01398. The importance factors, given in Table [1] provide an accurate variable 
ranking for the failure function, given the factors. 



Variable 




X2 




X4 


Importance factor 


0.018 


0.679 


0.302 






Table 1: Importance factors for hyperplane function 



4.1.2 Sobol' indices 

The first-order and total indices are displayed in Table l4T!2l The interpretation of the results is that X2 and 
concentrate most of the variance of the indicator function. At first order, 25% of its variance is explained 
by X2 without any interaction. It should be noted that the total index for X4 is null, assessing that this 
variable does not impact the failure probability. The CV of the total indices estimators are small, meaning 
that this method is reproducible and that 10^ points are enough to estimate in an efficient way the indices 
Sri- On the other hand, some CV values for low mean first order indices are quite high. The conclusion of 
this result is that the method correctly estimates high indices but estimates poorly the indices close to 0. 
On the other hand, the relevant information is that the index is close to 0. Thus this situation may not be 
a problem. 



Sobol' Index 


Si 


S2 


S3 


Si 


Sti 


St2 


St3 


St4 


Mean 


0.0017 


0.2575 


0.0544 


9.45.10-^ 


0.1984 


0.9397 


0.7256 





C.o.v. 


1.5854 


0.04826 


0.1336 


27.4 


0.012 


0.0069 


0.013 






Table 2: Sobol' indices for hyperplane function 



4.1.3 Density modification based reliability indices 

The method presented throughout this article is applied on the hyperplane function. As explained in Section 
[3l several ways to perturb the input distributions exist. For this case, we choose to apply first a mean 
twisting, then a variance twisting with fixed mean. A simple calculus gives that the perturbed pdf are 
Gaussian, respectively with the constraint mean and variance 1 for the mean twisting perturbation (see 
Table [7]), and with mean and the constraint variance for the variance twisting perturbation. Thus, the 
MC estimation gives P — 0.01446. For the mean twisting (see (|S])), the variation range chosen for S is from 
— 1 to 1 with 40 points, reminding that S — cannot be considered as a perturbation. For the variance 
twisting (see ©), the variation range chosen for Vp^^ is from 1/20 to 3 with 28 points, where Vp„ = 1 is not 
a perturbation. The estimated indices are plotted respectively in Figure [1] for mean twisting and in Figure [5] 
for variance twisting. 95% confidence intervals are plotted around the indices. 

Mean perturbation indices The indices Sis behave in a monotonic way given the importance of the 
perturbation. The slope at the origin is directly related to the value of a;. For infiuential variables {X2 and 
ATs), the increasing or the decreasing is faster than linear, whereas the curve seems linear for the slightly 
influential variable (Xi). Modifying the mean of amplitude S positive slightly rises the failure probability 
for Xi , highly decreases it for X2 and increases it for X3 (Figure [1]) . The effects are reversed with similar 
amplitude for negative S. It can be seen that X4 has no impact on the failure probability for any perturbation. 
Those results are consistent with the expression of the failure function. One can see that the confidence 
intervals (CI) associated to X2 and X3 are fairly well separated, except for the small absolute value of S. On 
the other hand, the CI associated to Xi and X4 are not separated until absolute value of S higher than 0.6. 
The conclusion of the observation of the CI is that one cannot differentiate the impact of variable Xi and 
X4 unless a broad change of the mean occurs. 
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s 



Figure 1: Estimated indices Sis f^or hyperplane function with a mean twisting 

Variance perturbation indices Increasing the variance of input X2 and X3 increases the failure proba- 
bihty, whereas it is the opposite when decreasing the variance (Figure [2]). Modifying the variance of Xi and 
X4 have no effect on the failure probability. The increasing of the indices is linear for X2 and X3 , and the 
decreasing of the indices is faster than linear, especially for X2- Considering the CI, one can see that they 
are well separated for variable X2 and X3, assessing the relative importance of these variables. On the other 
hand, as the CI associated to Xi and X4 are not separated and contain 0. 

4.2 Thresholded Ishigami function 

The Ishigami function is a common test case in SA since it has a complex expression, with interactions 
between the variables. A modified version of the Ishigami function will be considered in this paper. One has: 

G(X) = sin {Xi) + 7 sin (Xa)^ + 0.1X1 sin {Xi) + 7 

where X is a 3— dimensional vector of independent marginals uniformly distributed on [— 7r,7r] . In Figure [3l 
the failure points (where G'(x) < 0) are plotted in a 3-d scatterplot. 

There are 614 failure points on a MC sample of 10^ points therefore the failure probability here is roughly 
P = 6.14.10^'^. The complex repartition of the failure points can be noticed. Those points lay in a zone 
defined by the negative values of Xi, the extremal and mean values of X2 (around — tt, and tt), and the 
extremal values of X3 (around — tt and tt). 

4.2.1 FORM 

The algorithm FORM converges to an incoherent design point (6.03,0.1,0) in 50 function calls, giving an 
approximate probability of Pform = 0.54. The importance factors are displayed in Table [3l The bad 
performance of FORM is expected given that the failure domain consists in six separate domains and that 
the function is highly oscillant, leading to optimization difficulties. The design point is aberrant, thus the 
FORM results of SA are incorrect. 

4.2.2 Sobol' indices 

The first-order and total indices are displayed in Table S) The small values of first order indices show that 
no variable has impact on the variance of the indicator of failure on its own. The three total indices have 
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0,5 1.0 1.5 2.0 2,5 3.0 

Figure 2: Estimated indices S'i.Vp^^ for hyperplane function with a variance twisting 



Variable 


Xi 


X2 




Importance factor 




1 






Table 3: Importance factors for Ishigami function 

relatively high and similar values. This states that all the variables highly interact with each other to cause 
system failure. The SI method is thus non-discriminant in this case. The low CV shows that the method is 
reproducible. 



Sobol' Index 


Si 


S2 


53 


Sti 


St2 


Stv, 


Mean 


0.0234 


0.0099 


0.0667 


0.8158 


0.6758 


0.9299 


C.o.v. 


0.0072 


0.0051 


0.0095 


0.0156 


0.0216 


0.0094 



Table 4: Sobol' indices for Ishigami function 



4.2.3 Density modification based reliability indices 

The method presented throughout this article is applied on the thresholded Ishigami function. As for the 
hyperplane test case, a mean twisting and a variance twisting are applied. The modified distribution when 
a mean shift is applied on a uniform distribution is given in Table [T] The modified pdf when shifting the 
variance and keeping the same expectation is proportional to a truncated Gaussian when decreasing the 
variance. When increasing the variance, the perturbed distribution is a symmetrical distribution with 2 
modes close to the endpoints of the support. As previously, the same MC sample of size 10^ (also used to 
produce Figure [3]) is used to estimate the indices with both perturbations. For the mean twisting (see ([8])), 
the variation range chosen for ^ is —3 to 3 with 60 points - numerical consideration forbidding to choose a 
shifted mean closer to the endpoints. For variance twisting, the variation range chosen for V-^^^ is 1 to 5 with 
40 points. Let us recall that the original variance is Var[Xi] = 7r^/3 ~ 3.29. The estimated indices are plotted 
respectively in Figure |4] for mean twisting and in Figure [5] for variance twisting. 
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Figure 3: Ishigami failure points from a MC sample 
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Figure 4: Estimated indices Sis for the thresholded Ishigami function with a mean twisting 



Mean perturbation indices A perturbation of the mean for X2 and will increase the failure proba- 
bility, though the impact for the same mean perturbation is stronger for X3 (53,-3 and 53,3 approximately 
equal respectively 9.5 and 10, Figure SJ. On the other hand, the indices concerning Xi show that a mean 
shift between —1 and —2 increases the failure probability, whereas an augmentation of the mean or a large 
diminution strongly diminishes the failure probability (S'1,3 approximatively equals —7.10^^). Therefore, Fig- 
ure |4] leads to two conclusions. First, the failure probability can be strongly reduced when shifting the mean 
of the first Xi (this is also provided by Figure |3] wherein all failure points have a negative value of Xi). 
Second, any change in the mean for X2 or X3 will lead to an increase of the failure probability. The CI are 
well separated, except in the —1 to 1 zone. One can notice that the CI associated to X2 contains between 
values of 6 from —1.5 to 1.5, thus the associated indices might be null in these case. This has to be taken 
into account when assessing the relative importance of X2 ■ 



Variance perturbation indices Figure [5] (upper) shows that a change in the variance has little effect 
on X2 and Xi, though the change is of opposite effect on the failure probability. However, considering 
that the indices S'2,Vpcr,i and S'i,yp„_i he between —0.4 and 0.4, one can conclude that the variance of theses 
variables are not of great influence on the failure probability. On the other hand. Figure [5] (lower) shows 
that any reduction of Var [X^] strongly decreases the failure probability, and that an increase of the variance 
slightly increases the failure probability. This is relevant with the expression of the failure surface, as X^ is 
fourth powered and multiplied by the sinus of Xi. A variance decrease as formulated gives a distribution 
concentrated around 0. Decreasing Var [X3] shrinks the concerned term in G(X). Therefore it reduces the 
failure probability. The CI associated to V3 are broadly separated from the others. On the other hand, the 
CI associated to Xi and X2 overlap when the fixed variance goes from 1.5. to 4. It is thus not possible to 
conclude with certainty on the difference of impact of those variables. 



4.3 Industrial case : flood case 

The goal of this test case is to assess the risk of a flood over a dyke for the safety of industrial installations. 
This comes down to model the height of a flood. Given the uncertainty upon numerous physical parameters, 
the uncertainty approach is used and unknown parameters are modelled by random variables. From a 
simplification of the Saint- Venant equation, a flood risk model is obtained. The quantity of interest is the 
difference between the height of the dyke and the height of water. If this quantity is negative, the installation 
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is flooded; this is the failure event that is considered. Several quantities will be denoted as follows: Q the 
flow rate, L the watercourse section length studied, B the watercourse width. Kg the watercourse bed friction 
coefficient (also called Strickler coefiicient), Zm and respectively the upstream and downstream bottom 
watercourse height above sea level and the dyke height measured from the bottom of the watercourse 
bed. The water height model is expressed as: 



Therefore the following quantity is considered: 

G = Hd-{Z,, + H). 

Among the model inputs, the choice is made that the following variables are known precisely: L = 5000 
(m), B = 300 (m), Hd — 58 (m), and the following are considered to be random. Q (m^.s^^) follows a 
positively truncated Gumbel distribution of parameters a = 1013 and b — 558 with a minimum value of 0. 
Kg (m^/'^s^^) follows a truncated Gaussian distribution of parameters fi — 30 and a — 7.5, with a minimum 
value of 1. Zy (m) follows a triangular distribution with minimum 49, mode 50 and maximum 51. Z^i (m) 
follows a triangular distribution with minimum 54, mode 55 and maximum 56. 

4.3.1 FORM 

The algorithm FORM converges to a design point (1.72,-2.70,0.55,-0.18) in 52 function calls, giving an 
approximate probability of Pform — 5.8. 10^^. The importance factors are displayed in Table[5] 



Variable 


Q 


Ks 






Importance factor 


0.246 


0.725 


0.026 


0.003 



Table 5: Importance factors for flood case 



FORM assesses that Kg is of extremely high influence, followed by Q that is of medium influence. Z„ has 
a very weak influence and Z^, is negligible. It can be noticed that the estimated failure probability is twice 
as small as the one estimated with crude MC, but remains in the same order of magnitude. 

4.3.2 Sobol' indices 

The first-order and total indices are displayed in Table [S) It can be seen that the estimates of some indices 
are negative despite the fact that Sobol indices are theoretically positive. The estimation can indeed produce 
negative results for values close to 0. 



Sobol Index 


Sq 










StKs 


Stz^ 


^TZm 


Mean 


0.0169 


0.2402 


-7.10"'"'' 


-5.10-* 


0.7447 


0.9782 


0.2684 


0.1062 


C.o.v. 


0.0122 


0.0577 


0.0029 


0.0023 


0.0553 


0.0137 


0.0516 


0.0389 



Tabic 6: Sobol' indices for flood case 



Considering the first order indices, Z^ and Zm are of null influence on their own. Q is considered to 
have a minimal influence (1% of the variance of the indicator function) by itself, and Kg explains 24% of the 
variance on its own. When considering the total indices, it can be noticed that both Zy and Z^ have a weak 
impact on the failure probability. On the other hand, Q has a major influence on the failure probability. 
Kg total index is close to one, therefore Kg explains (with or without any interaction with other variables) 
almost all the variance of the failure function. 
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5. 



Figure 6: Estimated indices Sis for flood case with a mean twisting 
4.3.3 Density modification based reliability indices 

The metliod presented throughout this article is applied on the flood case. Only the mean twisting will be 
applied here. The modified pdf are given in Table [T] One can notice that the different inputs follow various 
distributions (unlike the other examples), thus the question of "equivalent" perturbation arises. It will be 
discussed further in Section [5j Here the choice has been made to shift the mean relatively to the standard 
deviation, hence including the spread of the various inputs in their respective perturbation. So for any input, 
the original distribution is twisted so that the perturbed distribution's mean is the original's one plus 5 times 
its standard deviation, S going from —1 to 1 with 40 points. The 10^ MC sample gives an estimation of the 
failure probability P = 8.6. 10~*. 

Figure [6] assesses that an increasing of the mean of the inputs increases the failure probability slightly for 
Z^v, strongly for Q, and diminishes it slightly for Zm and strongly for Ks- This goes the opposite way when 
decreasing the mean. In terms of absolute modification, Kg and Q are of same magnitude, even if Kg has a 
slightly stronger impact. On the other hand, the effects of mean perturbation on Z,„ and Z^ are negligible. 
The CI associated to Q and Kg are well separated from the others, except in a S = —.3 to .3 zone. The CI 
associated to Z^ and Z,„ overlap, thus even though the indices seem to have different value, it is not possible 
to conclude with certainty about the influence of those variables. 

5 Discussion 

5.1 Conclusion on the method 

The method presented in this paper gives interesting complementary information in addition of traditional 
SA methods applied to a reliability problem. Additionally, it has two advantages: 

• The ability for the user to set the most adapted constraints considering his/her problem, 
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• The MC framework allowing to use previously done function calls, thus limiting the CPU cost of the 
SA, and allowing the user to test several perturbations. 



5.2 Equivalent perturbation 

The question of " equivalent" perturbation arises from cases where all inputs are not identically distributed. 
Indeed, problems may emerge when some inputs are defined on infinite intervals and when other inputs 
are defined on finite intervals (such as uniform distributions). Consider a two-dimensional model with one 
Gaussian distribution and one uniform distribution as inputs. Thus, a mean shift will be a translation for the 
first input, whereas it will lead to a Dirac distribution in one endpoint for the other input. Hence, a mean 
shift cannot be considered as an "equivalent" perturbation. One could think of a "relative mean shift" , which 
seems a fairly good idea. But let one consider a model with two Gaussian inputs of equal variance 1 and of 
mean respectively 1 and 10000. Then, a relative mean shift of 10% will result in Gaussian distributions with 
mean respectively 1.1 and 11000, and still variance 1. This counter-example shows that relative mean shift 
might not be an adequate perturbation in terms of "equivalence". 

5.3 Further work 

Two main avenues are of interest: 

• To adapt the estimator of the indices Sis, in term of variance reduction and of number of function calls. 
Further work will be made with importance sampling methods, and possibly subset methods. The use 
of sequential methods [4 may also be tested, 

• To find a way to perturb " equivalently" several distributions of different natures. A perturbation that 
is not based upon a moment constraint but rather of an entropy constraint might be proposed. The 
differential entropy of a distribution can be seen as a quantification of uncertainty |2J . Thus an example 
of (non-linear) constraint on the entropy can be: 



Yet further computations have to be made to obtain a tractable solution of the KL minimization 
problem under the above constraint. 
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Appendices 
A Proofs 

A.l Proof of Lemma 12.11 

Under assumption (i), we have 

/ l{G(x)<o} I , \ /(x) dx < / fis{xi) dxi = 1. 

JSupp(/,5) Ji[Xi) JSuppifis) 

So that, the strong LLN may be applied to PiSN- Defining 

0-^5 = Var 

one has 

a% = / l{G(x)<o} 1 1 \ \\f]{X]) d^^ Pfs < oo under Condition (ii). 

JSupp(/.) Jt[Xt) .^^ 

Therefore the CLT applies: 

Va^^t^/ [p^sn ~ P^s) A M{0, 1) . 

Under assumption (ii), the strong LLN applies to o-'fgjy. So that, the final result is straightforward using 
Slutsky's lemma. 



1{G(X)<0} 



(10) 
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A. 2 Proof of Proposition 12.11 

First, note that 



PP,t 



-PP^s = E 



PP,s 



iV2 



Ll{G(x")<0}J ^^pry + l{G(x")<0}l{G(xJ)<0} 



71=1 



n=l j=ti 



Mxi) 



-PP.5 



= ]V2 + (A^ - 1) ^^^^5] - PPrS 

= ^ iP.s - PP.s) ■ 

Assuming the conditions under which Lemma 1 is true, the bivariate CLT foUows with 

p(i-p) ns{i~p) 

Ps{l-P) afs 

Each term of this matrix can be consistently estimated, using the results in Lemma 1 and Slutsky's lemma. 



B Computation of Lagrange multipliers 

Let H be the Lagrange function: 

K 

fe=i 

Thus, using the results of [6], we have 

A* = argmin_ff(A). 
The expression of the gradient of H with respect to the j^^ variable is 

fin fVi 

In the same way, the expression of the second derivative of H with respect to the h and the j variables 



f !Jh{x)gj{x)fi{x)exp{J2k=i \k9k{x))dx 



exp -04 (A) 

1 9]{x)fi{x) exp(X;f=i Xk9k{x))dx J gh{x)f^{x) exp{Y,k=i Xk9k{x))dx 



expipi{X) 



exp ipi{X) 



This method has been used in this paper for computing the optimal vector A* when a variance twisting was 
applied. The integral were evaluated with Simpson's rule. 



C Numerical trick to work with truncated distribution 

In the case where a mean twisting is considered on a left truncated distribution, here is presented a tip that 
can help to compute A*. 

The studied trucated variable Yt has distribution /yt- Let us denote Y ^ fy the corresponding non- 
truncated distribution. The trucation occurs for some real value a. This truncation may happen for some 
physical modelling reason. One has: 



fyriy) 



1 



1 - Fia) 



l[a,+oo[(y)/y(2/)- 
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The formal definition of Myt{^) the mfg of Yt for some A is: 



MytW 



/Y(y)exp [Xy]dy. 



Let us recaU that we are looking for A* such as: 

^ _ M^^jX*) _ /+°° yfYjy) exp [Xy] dy 
Myt{X*) fYiy)e^p[Xy]dy' 



(11) 



When the expression does not take a practical form, one can use numerical integration to estimate 
the integral term. Unfortunately, for some heavy tailed distribution (for instance Gumbel distribution) , this 
numerical integration might be complex or not possible. This is due to the multiplication by an exponential of 
y. The following tip helps to aviod such problems. Denoting Afy(A) the mfg of the non-trucated distribution 
for some , one can remark that: 

fy (y) exp [Xy] dy ^ /y (y) exp [Xy] dy + /y (y) exp [Ay] dy 

-oo J —oo J a 

Thus another expression for Myt{X) is: 



Myt{X) = 



1 



l-FY{a) 



My{X)- j fY{y)cxp[Xy]dy 
<y —oo 



The integral term is much smaller in the left heavy tailed distribution case. Therefore the numerical integra- 
tion (for instance using Simpson's method) is much more precise or became possible. 
The same goes for Myj.(A) which has alternative expression: 



M(,y(A) = 
Finally, another form offTTjis: 



1 



Ml^{X)- I y/y(y) cxp [Ay] dy 



l-Fy(a) 

^ My(A)-/:^ y/y(y)exp[Ay]c;y 
My(A)-/:^/y(y)exp[Ay]dy ' 



(12) 



This alternative expression may lead to more precise estimations of A* when My (A) and Afy(A) are 
known (which is the case for most usual distribution) since the integral term are much smaller than in the 
first expression . A reference to this Appendix is made in the summary table [T] 

D Proofs of the NEF properties 

In this Appendix, the details of the calculus for the Proposition 13.41 are detailed. The definition of NEF was 
given in the concerned Proposition. 

NEF specificities : If the original density fi{x) is a NEF, then under a set of K linear constraints on 
/(x), one has : 

f{x) — b{x) exp [x9 — r]{9)] , 

thus : 



fsix) = /(a;) exp 



K 



^Xk9k{x) - ?A(A) 



.fc=i 



The regularization constant from ([7]) can be written as: 



ip{X) = log J b{x) exp 



K 



xO + Xkgk{x) - 77(6*) 



fc=i 



dx 



(13) 



If the integral on[T2]is finite, fs exists and is a density. 
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Mean twisting With a single constraint formulated as in ([5]), ([T^ becames : 
ipW — log J b{x) exp [x6 + \x — 'r]{9)] dx 

= log / b{x) exp [x (6* + A) - r/(6l) + 77(6* + A) - 77(6* + A)] dx 



if 77(0 + A) is well defined. 



{r){0 + A) - 7/(61)) + log 
77(0 + A) - 



exp [x (6* + A) - 77(6* + A)] 



dx 



since 



b{x) exp [x (0 + A) - 77(^? + A)] = fg+x{x) 
with notation from p.4p . is a density of integral 1. Thus 

fs{x) = 6(a;) exp [a;6' - 0(6*)] exp [Aa; - 77(6* + A) + 7/(0)] 
= b{x) exp [a; + A] - 77(^? + A)] = fe+xix) 

Thus the mean twisting of a NEF of CDF 77(.) results in another NEF with mean 77'(6' + A) = (5 (constraint) 
and variance 7/" {6 + X). 

Variance twisting With a single constraint formulated as in (|13p thus the new distribution has density 

fs{x) = b{x) exp [x9 + xXi + x'^X2 - ip{X) - 77(6*)] 

Since A is known or computed, and 9 is also known, one has the variable change z — ^/X^x assuming A2 is 
strict, positive (the variable change is z = \/^X2X if A2 is strict, neg.). Thus, 



fs{x) = fe(— =)exp [z^] exp 

V^2 



exp 



' + Ai) 



V^2 



+ Xi) - ^{X) - V{9) 



c(z) exp 



+ Ai) 



(g + Ai) 

^/x; 



with 



By©, 



c(z) =6(^=)exp[z2] 

V A2 



log / b{x) exp [x9 + xXi + x^X2 — ??(6')] dx 



= [V 



log / 6(— ^)exp [z^] exp 
J V A2 

(e + Xi) 



(g + Ai) 

^/x; 



z — 77(6*) + 7/ 



(g + Ai) 

^fX2 



+ Ai) 

^/x; 



- 77(6*) ) + log / c(z) exp 
-77(0) 



' + Ai) 



-z - 77 



- V 
+ Ai) 

s/x; 



(g + Ai) 

Va; 

dx 



dx 



By (Ull) and dH]), one has 



fs{x) = c(z)exp 



+ Ai) 

Z- ;= f] 



'X2 



(g + Ai) 
^/X'2 



thus the variance twisting of a NEF results in another NEF parametrized by 



(e+Ai) 



E Summary Table with modified distributions for mean shift 
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Original distribution 



Modified distribution 



Modified pdf fa 



Link between A* and S 



NEF(6l) 
Special case of NEF: Af{fi, a) 

Uniform distribution: l^[a,b] 
Left Tr Gaussian A/r (/U, ct, a) 

Triangle T(a, b, c) 
Left Tr Gumbel QrilJ., P, a) 



NEF(e + \*) 

oc truncated exponential 

JVt (/i + o-^ A* , (T, a) iis (xi 



fis(xi) = b{xi) exp [xi [e + A*] - ri{e + A*)] 



exp 



-- exp 



1 ( Xj-ij.—cr'^y y 

2 I <r J 



= exp(a;iA* - xp{X''))f(xi) 
hsixi) = exp{xiX* - ip{X*))f{xi) 



r,'{e + \*) = S 



A* 



(.b-a) 



^(A*b-l)+e^*"(l-A*a) 



_ (a-^)e^'^{b-c) + {b-^)e^'''{c-a) + (c-^)e^''''(a-b) 

e^*"(6-c)+e^*''(c-a)+e^* = (a-6) 
M^{A*)-/^^ H/y (j;) Gxp[A*!;]di/ 
* ~ My{A*)-/°^/y(y)oxp[A*yl<ij/ • 

Afy (A*) = r (1 - /3) exp [AV] 
M(.(A*) = r (1 - /3) exp [A*m] [/X - /3r(°)(l - A*)] 



Table 7: Modified distributions for mean twisting. Note that $(.)is the cdf of the standard normal distribution, and is its pdf. 



